Deep Learning-Driven Multimodal Detection and Movement Analysis of Objects in Culinary Cooking Action
Ishat, Tahoshin Alam, Qayum, Mohammad Abdul
Abstract--This research investigates the feasibility of an intelligent, multi-modal AI system that interprets visual, audio, and motion-based data to analyse and comprehend cooking recipes. The system integrates object segmentation, hand motion classification, and audio-to-text conversion with natural language processing to create a comprehensive pipeline that imitates human-level understanding of kitchen tasks and recipes. The early stages of the project involved experimenting with pre-made datasets, specifically the COCO dataset for object segmentation, which yielded suboptimal results for the project's use case. To overcome this, a domain-specific dataset was curated by collecting and annotating over 7,000 kitchen-related images, later augmented to 17,000 images. Several YOLOv8 segmentation models were trained on this dataset to detect 16 essential kitchen objects. Additionally, short-duration videos capturing cooking actions were collected and processed using MediaPipe to extract hand, elbow, and shoulder keypoints. These were used to train an LSTM-based model for hand action classification. The pipeline also incorporates Whisper, an audio-to-text transcription model, and leverages a large language model such as TinyLlama to generate structured cooking recipes from the multi-modal inputs.
A. Background and motivation: In the era of computer vision, every crucial task in our day-to-day life is also being taken over by artificial intelligence and machines.
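As a rough illustration of the keypoint-to-action part of such a pipeline, a minimal sketch follows; this is not the authors' code, and the sequence length, joint set, layer sizes, and class count are illustrative assumptions. It extracts shoulder/elbow/wrist landmarks per frame with MediaPipe Pose and feeds the sequence to a small Keras LSTM classifier.

```python
# Sketch: per-frame shoulder/elbow/wrist keypoints via MediaPipe Pose,
# classified as a sequence by an LSTM. Hyperparameters are assumptions.
import cv2
import numpy as np
import mediapipe as mp
from tensorflow.keras import layers, models

SEQ_LEN = 30                        # assumed frames per short action clip
N_CLASSES = 5                       # assumed number of cooking actions
JOINTS = [11, 12, 13, 14, 15, 16]   # MediaPipe indices: shoulders, elbows, wrists

def extract_keypoints(video_path: str) -> np.ndarray:
    """Return a (SEQ_LEN, len(JOINTS) * 3) array of x, y, z joint coordinates."""
    frames = []
    cap = cv2.VideoCapture(video_path)
    with mp.solutions.pose.Pose(static_image_mode=False) as pose:
        while len(frames) < SEQ_LEN:
            ok, frame = cap.read()
            if not ok:
                break
            result = pose.process(cv2.cvtColor(frame, cv2.COLOR_BGR2RGB))
            if result.pose_landmarks:
                lm = result.pose_landmarks.landmark
                frames.append([c for j in JOINTS for c in (lm[j].x, lm[j].y, lm[j].z)])
    cap.release()
    # Pad short clips with zeros so every sample has the same length.
    while len(frames) < SEQ_LEN:
        frames.append([0.0] * (len(JOINTS) * 3))
    return np.asarray(frames, dtype=np.float32)

# A small sequence classifier over the per-frame keypoint vectors.
model = models.Sequential([
    layers.Input(shape=(SEQ_LEN, len(JOINTS) * 3)),
    layers.LSTM(64),
    layers.Dense(32, activation="relu"),
    layers.Dense(N_CLASSES, activation="softmax"),
])
model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
```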
- Research Report (0.64)
- Workflow (0.46)
Real-World Cooking Robot System from Recipes Based on Food State Recognition Using Foundation Models and PDDL
Kanazawa, Naoaki, Kawaharazuka, Kento, Obinata, Yoshiki, Okada, Kei, Inaba, Masayuki
Although there is a growing demand for cooking behaviours as one of the expected tasks for robots, a series of cooking behaviours based on new recipe descriptions by robots in the real world has not yet been realised. In this study, we propose a robot system that integrates real-world executable robot cooking behaviour planning using the Large Language Model (LLM) and classical planning of PDDL descriptions, and food ingredient state recognition learning from a small number of data using the Vision-Language model (VLM). We succeeded in experiments in which PR2, a dual-armed wheeled robot, performed cooking from arranged new recipes in a real-world environment, and confirmed the effectiveness of the proposed system.
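To make the classical-planning side concrete, the sketch below writes out a toy PDDL cooking domain as a Python string and builds a matching problem from a list of ingredients. The predicates, operators, and names are illustrative assumptions, not the domain used in the paper.

```python
# Sketch: a toy PDDL cooking domain fragment, embedded as a Python string.
# Predicate and action names are illustrative, not the paper's actual domain.
TOY_DOMAIN = """
(define (domain toy-cooking)
  (:predicates (raw ?x) (cooked ?x) (in-pan ?x) (pan-free))
  (:action put-in-pan
    :parameters (?x)
    :precondition (and (raw ?x) (pan-free))
    :effect (and (in-pan ?x) (not (pan-free))))
  (:action fry
    :parameters (?x)
    :precondition (in-pan ?x)
    :effect (and (cooked ?x) (not (raw ?x)))))
"""

def make_problem(ingredients: list[str], goal: str) -> str:
    """Build a matching PDDL problem for a list of raw ingredients."""
    objs = " ".join(ingredients)
    init = " ".join(f"(raw {i})" for i in ingredients) + " (pan-free)"
    return (f"(define (problem toy-recipe) (:domain toy-cooking)\n"
            f"  (:objects {objs})\n  (:init {init})\n  (:goal (cooked {goal})))")

print(make_problem(["egg", "onion"], "egg"))
```

In the paper's setup the LLM would supply the recipe-specific goal and operator sequence while a classical planner searches this kind of symbolic description; the fragment above only shows the shape of that symbolic layer.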
PizzaCommonSense: Learning to Model Commonsense Reasoning about Intermediate Steps in Cooking Recipes
Diallo, Aissatou, Bikakis, Antonis, Dickens, Luke, Hunter, Anthony, Miller, Rob
Decoding the core of procedural texts, exemplified by cooking recipes, is crucial for intelligent reasoning and instruction automation. Procedural texts can be comprehensively defined as a sequential chain of steps to accomplish a task employing resources. From a cooking perspective, these instructions can be interpreted as a series of modifications to a food preparation, which initially comprises a set of ingredients. These changes involve transformations of comestible resources. For a model to effectively reason about cooking recipes, it must accurately discern and understand the inputs and outputs of intermediate steps within the recipe. Aiming to address this, we present a new corpus of cooking recipes enriched with descriptions of intermediate steps of the recipes that explicate the input and output for each step.
[Figure 1: A graphical depiction of the PizzaCommonSense motivation. Models are required to learn knowledge about the input and output of each intermediate step and predict the correct sequencing of these comestibles given the corresponding instructions.]
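The kind of intermediate-step reasoning the corpus targets can be pictured with a small data structure; this is a hypothetical sketch, and the field names are assumptions rather than the dataset's schema.

```python
# Sketch: a recipe step annotated with its input and output comestibles,
# plus a check that each step consumes what an earlier step produced.
from dataclasses import dataclass

@dataclass
class Step:
    instruction: str
    inputs: set[str]    # comestibles the step consumes
    output: str         # comestible the step produces

RECIPE = [
    Step("mix flour, water and yeast", {"flour", "water", "yeast"}, "dough"),
    Step("spread sauce on the dough", {"dough", "tomato sauce"}, "sauced base"),
    Step("bake the pizza", {"sauced base", "cheese"}, "pizza"),
]

def check_flow(steps: list[Step], pantry: set[str]) -> bool:
    """Every input must be a pantry item or the output of a previous step."""
    available = set(pantry)
    for step in steps:
        if not step.inputs <= available:
            return False
        available -= step.inputs
        available.add(step.output)
    return True

print(check_flow(RECIPE, {"flour", "water", "yeast", "tomato sauce", "cheese"}))
```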
- North America > United States > Massachusetts > Middlesex County > Cambridge (0.04)
- Europe > United Kingdom > England > Greater London > London (0.04)
- Europe > Ireland > Leinster > County Dublin > Dublin (0.04)
- Workflow (0.66)
- Research Report (0.64)
Cook-Gen: Robust Generative Modeling of Cooking Actions from Recipes
Venkataramanan, Revathy, Roy, Kaushik, Raj, Kanak, Prasad, Renjith, Zi, Yuxin, Narayanan, Vignesh, Sheth, Amit
As people become more aware of their food choices, food computation models have become increasingly popular in assisting people in maintaining healthy eating habits. For example, food recommendation systems analyze recipe instructions to assess nutritional contents and provide recipe recommendations. The recent and remarkable successes of generative AI methods, such as auto-regressive large language models, can lead to robust methods for a more comprehensive understanding of recipes for healthy food recommendations beyond surface-level nutrition content assessments. In this study, we explore the use of generative AI methods to extend current food computation models, primarily involving the analysis of nutrition and ingredients, to also incorporate cooking actions (e.g., add salt, fry the meat, boil the vegetables, etc.). Cooking actions are notoriously hard to model using statistical learning methods due to irregular data patterns - significantly varying natural language descriptions for the same action (e.g., marinate the meat vs. marinate the meat and leave overnight) and infrequently occurring patterns (e.g., add salt occurs far more frequently than marinating the meat). The prototypical approach to handling irregular data patterns is to increase the volume of data that the model ingests by orders of magnitude. Unfortunately, in the cooking domain, these problems are further compounded with larger data volumes presenting a unique challenge that is not easily handled by simply scaling up. In this work, we propose novel aggregation-based generative AI methods, Cook-Gen, that reliably generate cooking actions from recipes, despite difficulties with irregular data patterns, while also outperforming Large Language Models and other strong baselines.
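The irregular-pattern problem described above is easy to see in code: the same action surfaces under many natural-language variants, so a naive frequency model fragments its counts. The toy normalization step below is purely illustrative; Cook-Gen's actual aggregation method is learned and considerably more involved.

```python
# Sketch: collapsing varied surface forms of the same cooking action before
# counting them. The variant map is a toy stand-in for learned aggregation.
from collections import Counter

CANONICAL = {
    "marinate the meat": "marinate",
    "marinate the meat and leave overnight": "marinate",
    "add salt": "add-salt",
    "add a pinch of salt": "add-salt",
    "boil the vegetables": "boil",
}

def canonicalize(phrase: str) -> str:
    # Fall back to the first token when no variant is known.
    return CANONICAL.get(phrase.lower().strip(), phrase.split()[0].lower())

steps = ["Add salt", "add a pinch of salt", "marinate the meat and leave overnight"]
print(Counter(canonicalize(s) for s in steps))
# Counter({'add-salt': 2, 'marinate': 1})
```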
- North America > United States > South Carolina > Richland County > Columbia (0.15)
- North America > United States > Colorado > Boulder County > Boulder (0.04)
- Health & Medicine > Consumer Health (1.00)
- Education > Health & Safety > School Nutrition (0.48)
- Government > Regional Government > North America Government > United States Government (0.46)
Visual Recipe Flow: A Dataset for Learning Visual State Changes of Objects with Recipe Flows
Shirai, Keisuke, Hashimoto, Atsushi, Nishimura, Taichi, Kameko, Hirotaka, Kurita, Shuhei, Ushiku, Yoshitaka, Mori, Shinsuke
We present a new multimodal dataset called Visual Recipe Flow, which enables us to learn each cooking action result in a recipe text. The dataset consists of object state changes and the workflow of the recipe text. The state change is represented as an image pair, while the workflow is represented as a recipe flow graph (r-FG). The image pairs are grounded in the r-FG, which provides the cross-modal relation. With our dataset, one can try a range of applications, from multimodal commonsense reasoning and procedural text generation.
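The dataset's grounding of image pairs in a recipe flow graph can be sketched as a simple graph structure; the field names here are hypothetical, and the actual r-FG annotation scheme is richer.

```python
# Sketch: a recipe flow graph node holding a cooking action and the image
# pair (before/after) that shows the object's state change.
from dataclasses import dataclass, field

@dataclass
class FlowNode:
    action: str
    before_image: str          # path to the pre-action state image
    after_image: str           # path to the post-action state image
    successors: list["FlowNode"] = field(default_factory=list)

chop = FlowNode("chop the onion", "img/onion_whole.jpg", "img/onion_diced.jpg")
fry = FlowNode("fry the onion", "img/onion_diced.jpg", "img/onion_fried.jpg")
chop.successors.append(fry)   # the chopped onion flows into the frying step

# Walk the flow to list each action with its visual state change.
node = chop
while node:
    print(f"{node.action}: {node.before_image} -> {node.after_image}")
    node = node.successors[0] if node.successors else None
```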
- Asia > Japan > Honshū > Kansai > Kyoto Prefecture > Kyoto (0.05)
- North America > United States > New York > New York County > New York City (0.04)
- Workflow (0.88)
- Research Report (0.64)
Affect Sensing in Metaphorical Phenomena and Dramatic Interaction Context
Zhang, Li (Teesside University)
Metaphorical interpretation and affect detection using context profiles from open-ended text input are challenging in the field of affective language processing. In this paper, we explore recognition of a few typical affective metaphorical phenomena and context-based affect sensing, using the modeling of speakers' improvisational mood and other participants' emotional influence on the speaking character under the improvisation of loose scenarios. The overall updated affect detection module is embedded in an AI agent. The new developments have enabled the AI agent to perform generally better in affect sensing tasks. The work emphasizes the conference themes of affective dialogue processing, human-agent interaction and intelligent user interfaces.
- North America > United States > New York (0.04)
- Europe > United Kingdom > England > Staffordshire (0.04)
- Europe > United Kingdom > England > North Yorkshire > Middlesbrough (0.04)
- Europe > Spain > Canary Islands > Gran Canaria (0.04)